Generation of Personalized MPEG-4 compliant Talking Heads
Abstract
This paper studies a new method for three-dimensional (3D) facial model adaptation and its integration into a Text-to-Speech (TTS) system. The TTS system produces English or Greek speech in real time and simultaneously animates the adapted face model, thus simulating a natural talking face. The 3D facial adaptation requires two orthogonal views of the user’s face, with a number of feature points located on both views. Based on the correspondences between the feature points’ positions, a generic face model is deformed non-rigidly, treating every facial part (e.g., nose, mouth) as a separate entity. A cylindrical texture map covering the whole area of the head is then built from the two image views by exploiting the inherent symmetry of the face. The result is a complete, textured model of a specific person’s head. The generated 3D models are then integrated into a talking head system, which consists of two distinct parts: a multilingual Text-to-Speech sub-system and a Facial Animation sub-system based on MPEG-4 Facial Animation Parameters (FAPs). Support for the Greek language has been added to both the TTS and the facial animation sub-systems while preserving synchronization between lips and speech.
Key-Words: MPEG-4, 3D model based coding, Text to Speech, facial adaptation, talking face
Related papers
Talking Head: Synthetic Video Facial Animation in MPEG-4
We present a system for facial modeling and animation that aims at the generation of photo-realistic models and performance-driven animation. It is a practical implementation of the MPEG-4 compliant Synthetic Video Facial Animation pipeline (Simple and Calibration Profiles with some modifications), which includes: facial features recognition & tracking on real video sequence; obtaining, encoding, net...
A MPEG-4 Virtual Human Animation Engine for Interactive Web Based Applications
This paper presents a novel, MPEG-4 compliant animation engine (body player). It has been designed to synthesize virtual human full-body animations in interactive multimedia applications for the web. We believe that a full-body player can provide a more expressive and interesting interface than the use of animated faces only (talking heads). This is one of the first implementations of an MPEG-4 ...
Compression of MPEG-4 facial animation parameters for transmission of talking heads
The emerging MPEG-4 standard supports the transmission and composition of facial animation with natural video. The new standard will include a facial animation parameter (FAP) set that is defined based on the study of minimal facial actions and is closely related to muscle actions. The FAP set enables model-based representation of natural or synthetic talking-head sequences and allows intelligi...
Three-Dimensional Facial Adaptation for MPEG-4 Talking Heads
This paper studies a new method for three-dimensional (3D) facial model adaptation and its integration into a text-to-speech (TTS) system. The 3D facial adaptation requires a set of two orthogonal views of the user’s face with a number of feature points located on both views. Based on the correspondences of the feature points’ positions, a generic face model is deformed nonrigidly treating ever...
A text-speech synchronization technique with applications to talking heads
In human communication, speech understanding is greatly improved by the bimodal acoustic-visual effect with respect to simple speech communication, in particular when the communication takes place in noisy environments. In this paper we propose a novel synchronization procedure between text and speech, to reduce the time consumption in the development of friendly audio-visual interfaces or auth...